Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 50000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 90 |
| Duplicate rows (%) | 0.2% |
| Total size in memory | 5.8 MiB |
| Average record size in memory | 121.0 B |
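The headline figures above can be cross-checked with a little arithmetic. The following is a minimal sketch that only reuses numbers copied from the table; the helper name `pct` is introduced here for illustration and is not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 50_000
n_cols = 16
duplicate_rows = 90
avg_record_bytes = 121.0


def pct(part: float, whole: float) -> str:
    """Format a share with one decimal place, as the report does."""
    return f"{100 * part / whole:.1f}%"


total_cells = n_rows * n_cols                   # cells scanned for missing values
total_mib = n_rows * avg_record_bytes / 2**20   # bytes -> MiB

print(total_cells)
print(pct(duplicate_rows, n_rows))
print(f"{total_mib:.1f} MiB")
```

The rounded results agree with the reported duplicate share and total memory size.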
Variable types
| Numeric | 10 |
|---|---|
| Boolean | 1 |
| Categorical | 5 |
Warnings
| month signup_date has constant value "1" | Constant |
|---|---|
| Dataset has 90 (0.2%) duplicate rows | Duplicates |
| date_delta is highly correlated with month last_trip_date | High correlation |
| month last_trip_date is highly correlated with date_delta | High correlation |
| month signup_date is highly correlated with luxury_car_user and 4 other fields | High correlation |
| luxury_car_user is highly correlated with month signup_date | High correlation |
| city_2 is highly correlated with month signup_date | High correlation |
| city_0 is highly correlated with month signup_date | High correlation |
| active is highly correlated with month signup_date | High correlation |
| city_1 is highly correlated with month signup_date | High correlation |
| phone has 15022 (30.0%) zeros | Zeros |
| surge_pct has 34409 (68.8%) zeros | Zeros |
| trips_in_first_30_days has 15390 (30.8%) zeros | Zeros |
| weekday_pct has 9203 (18.4%) zeros | Zeros |
| date_delta has 2302 (4.6%) zeros | Zeros |
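Warnings of the "Constant" and "Zeros" kind can be derived from a single pass over a column. The sketch below is illustrative only: the function name `column_warnings` and the `zeros_threshold` cut-off are assumptions made for this example, not documented pandas-profiling settings, and the input is assumed to be non-empty.

```python
from typing import List, Sequence


def column_warnings(name: str, values: Sequence[float],
                    zeros_threshold: float = 0.01) -> List[str]:
    """Return report-style warning strings for one non-empty column.

    zeros_threshold is a hypothetical cut-off chosen for this sketch.
    """
    found = []
    distinct = set(values)
    if len(distinct) == 1:
        found.append(f'{name} has constant value "{next(iter(distinct))}"')
    n_zeros = sum(1 for v in values if v == 0)
    share = n_zeros / len(values)
    if n_zeros and share > zeros_threshold:
        found.append(f"{name} has {n_zeros} ({share:.1%}) zeros")
    return found


# Toy columns, not the real data.
print(column_warnings("month signup_date", [1, 1, 1, 1]))
print(column_warnings("surge_pct", [0, 0, 0, 8.6, 50, 0, 0, 100]))
```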
Reproduction
| Analysis started | 2021-03-12 19:44:54.573245 |
|---|---|
| Analysis finished | 2021-03-12 19:45:52.452999 |
| Duration | 57.88 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
avg_dist
Real number (ℝ≥0)
| Distinct | 2908 |
|---|---|
| Distinct (%) | 5.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.7968266 |
|---|---|
| Minimum | 0 |
| Maximum | 160.96 |
| Zeros | 150 |
| Zeros (%) | 0.3% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1.2 |
| Q1 | 2.42 |
| median | 3.88 |
| Q3 | 6.94 |
| 95-th percentile | 16.78 |
| Maximum | 160.96 |
| Range | 160.96 |
| Interquartile range (IQR) | 4.52 |
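The quantile rows are tied together by simple relations: the range is maximum minus minimum, and the IQR is Q3 minus Q1 (6.94 - 2.42 = 4.52 above). Below is a small sketch of a quantile function, assuming linear interpolation between order statistics; whether the report used exactly this interpolation rule is an assumption. The sample holds the ten avg_dist values from the "First rows" table, not the full column.

```python
from typing import Sequence


def quantile(values: Sequence[float], q: float) -> float:
    """Quantile q in [0, 1] using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


# avg_dist of the first ten rows shown in the report.
sample = [3.67, 8.26, 0.77, 2.36, 3.13, 10.56, 3.95, 2.04, 4.36, 2.37]

q1 = quantile(sample, 0.25)
q3 = quantile(sample, 0.75)
print("Q1:", q1, "median:", quantile(sample, 0.5), "Q3:", q3)
print("IQR:", q3 - q1, "range:", max(sample) - min(sample))
```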
Descriptive statistics
| Standard deviation | 5.707356703 |
|---|---|
| Coefficient of variation (CV) | 0.984565711 |
| Kurtosis | 29.19171296 |
| Mean | 5.7968266 |
| Median Absolute Deviation (MAD) | 1.82 |
| Skewness | 3.464170294 |
| Sum | 289841.33 |
| Variance | 32.57392054 |
| Monotonicity | Not monotonic |
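Several of the descriptive statistics are functions of one another: the variance is the squared standard deviation and the coefficient of variation is the standard deviation divided by the mean. The sketch below computes the same kinds of quantities on a toy list. It assumes a sample standard deviation (n - 1) and plain moment-based skewness and excess kurtosis, so its estimators may differ slightly from the ones the report used; `describe` is a name chosen for this example.

```python
import statistics
from typing import Dict, Sequence


def describe(values: Sequence[float]) -> Dict[str, float]:
    """Summary statistics in the spirit of the table above."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    # Population central moments (an assumption of this sketch).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std": std,
        "variance": std ** 2,
        "cv": std / mean,
        "mad": mad,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis
    }


stats = describe([1, 2, 3, 4, 10])  # toy data, not the avg_dist column
print(stats)

# Consistency check on the reported avg_dist figures: CV = std / mean.
reported_cv = 5.707356703 / 5.7968266
print(round(reported_cv, 6))
```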
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 150 | 0.3% |
| 2.3 | 116 | 0.2% |
| 2.29 | 116 | 0.2% |
| 2.36 | 114 | 0.2% |
| 2.7 | 114 | 0.2% |
| 2.73 | 114 | 0.2% |
| 2.65 | 113 | 0.2% |
| 2.5 | 113 | 0.2% |
| 2.4 | 110 | 0.2% |
| 2.54 | 110 | 0.2% |
| Other values (2898) | 48830 | 97.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 150 | 0.3% |
| 0.01 | 38 | 0.1% |
| 0.02 | 14 | < 0.1% |
| 0.03 | 6 | < 0.1% |
| 0.04 | 12 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 160.96 | 1 | < 0.1% |
| 129.89 | 1 | < 0.1% |
| 79.69 | 1 | < 0.1% |
| 79.34 | 1 | < 0.1% |
| 77.13 | 1 | < 0.1% |
avg_rating_by_driver
Real number (ℝ≥0)
| Distinct | 28 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.778158196 |
|---|---|
| Minimum | 1 |
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 4.7 |
| median | 5 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 0.3 |
Descriptive statistics
| Standard deviation | 0.4457531013 |
|---|---|
| Coefficient of variation (CV) | 0.09328973278 |
| Kurtosis | 24.33824456 |
| Mean | 4.778158196 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -4.137232874 |
| Sum | 238907.9098 |
| Variance | 0.1986958273 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 28508 | 57.0% |
| 4.8 | 4537 | 9.1% |
| 4.7 | 3330 | 6.7% |
| 4.9 | 3094 | 6.2% |
| 4.5 | 2424 | 4.8% |
| 4.6 | 2078 | 4.2% |
| 4 | 1914 | 3.8% |
| 4.3 | 1018 | 2.0% |
| 4.4 | 860 | 1.7% |
| 3 | 602 | 1.2% |
| Other values (18) | 1635 | 3.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 181 | 0.4% |
| 1.5 | 4 | < 0.1% |
| 2 | 126 | 0.3% |
| 2.3 | 1 | < 0.1% |
| 2.5 | 31 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 28508 | 57.0% |
| 4.9 | 3094 | 6.2% |
| 4.8 | 4537 | 9.1% |
| 4.778158196 | 201 | 0.4% |
| 4.7 | 3330 | 6.7% |
avg_rating_of_driver
Real number (ℝ≥0)
| Distinct | 38 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.601559291 |
|---|---|
| Minimum | 1 |
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3.5 |
| Q1 | 4.5 |
| median | 4.7 |
| Q3 | 5 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 0.5649765903 |
|---|---|
| Coefficient of variation (CV) | 0.1227793786 |
| Kurtosis | 10.2979159 |
| Mean | 4.601559291 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | -2.653535617 |
| Sum | 230077.9646 |
| Variance | 0.3191985475 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 20771 | 41.5% |
| 4.601559291 | 8122 | 16.2% |
| 4 | 4193 | 8.4% |
| 4.5 | 2498 | 5.0% |
| 4.8 | 2430 | 4.9% |
| 4.7 | 1945 | 3.9% |
| 4.9 | 1771 | 3.5% |
| 4.3 | 1487 | 3.0% |
| 4.6 | 1143 | 2.3% |
| 3 | 1003 | 2.0% |
| Other values (28) | 4637 | 9.3% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 256 | 0.5% |
| 1.5 | 4 | < 0.1% |
| 1.6 | 1 | < 0.1% |
| 1.7 | 2 | < 0.1% |
| 1.8 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 20771 | 41.5% |
| 4.9 | 1771 | 3.5% |
| 4.8 | 2430 | 4.9% |
| 4.7 | 1945 | 3.9% |
| 4.601559291 | 8122 | 16.2% |
avg_surge
Real number (ℝ≥0)
| Distinct | 115 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0747638 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1.05 |
| 95-th percentile | 1.38 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0.05 |
Descriptive statistics
| Standard deviation | 0.2223360089 |
|---|---|
| Coefficient of variation (CV) | 0.206869648 |
| Kurtosis | 77.28146676 |
| Mean | 1.0747638 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.821346191 |
| Sum | 53738.19 |
| Variance | 0.04943330088 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 34454 | 68.9% |
| 1.25 | 1100 | 2.2% |
| 1.13 | 956 | 1.9% |
| 1.02 | 809 | 1.6% |
| 1.08 | 798 | 1.6% |
| 1.04 | 774 | 1.5% |
| 1.06 | 770 | 1.5% |
| 1.05 | 704 | 1.4% |
| 1.03 | 619 | 1.2% |
| 1.07 | 616 | 1.2% |
| Other values (105) | 8400 | 16.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 34454 | 68.9% |
| 1.01 | 484 | 1.0% |
| 1.02 | 809 | 1.6% |
| 1.03 | 619 | 1.2% |
| 1.04 | 774 | 1.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 8 | 1 | < 0.1% |
| 5.75 | 1 | < 0.1% |
| 5 | 5 | < 0.1% |
| 4.75 | 1 | < 0.1% |
| 4.5 | 4 | < 0.1% |
phone
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.69732 |
|---|---|
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 15022 |
| Zeros (%) | 30.0% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.4578945413 |
|---|---|
| Coefficient of variation (CV) | 0.6566490869 |
| Kurtosis | -1.251599269 |
| Mean | 0.69732 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -0.8613514419 |
| Sum | 34866 |
| Variance | 0.2096674109 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 34624 | 69.2% |
| 0 | 15022 | 30.0% |
| 0.8 | 184 | 0.4% |
| 0.6 | 136 | 0.3% |
| 0.4 | 32 | 0.1% |
| 0.2 | 2 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 15022 | 30.0% |
| 0.2 | 2 | < 0.1% |
| 0.4 | 32 | 0.1% |
| 0.6 | 136 | 0.3% |
| 0.8 | 184 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 34624 | 69.2% |
| 0.8 | 184 | 0.4% |
| 0.6 | 136 | 0.3% |
| 0.4 | 32 | 0.1% |
| 0.2 | 2 | < 0.1% |
surge_pct
Real number (ℝ≥0)
| Distinct | 367 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.849536 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 34409 |
| Zeros (%) | 68.8% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 8.6 |
| 95-th percentile | 50 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 8.6 |
Descriptive statistics
| Standard deviation | 19.9588109 |
|---|---|
| Coefficient of variation (CV) | 2.255351117 |
| Kurtosis | 10.43684717 |
| Mean | 8.849536 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.14412393 |
| Sum | 442476.8 |
| Variance | 398.3541325 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 34409 | 68.8% |
| 100 | 1416 | 2.8% |
| 50 | 1367 | 2.7% |
| 33.3 | 1152 | 2.3% |
| 25 | 906 | 1.8% |
| 20 | 790 | 1.6% |
| 16.7 | 708 | 1.4% |
| 14.3 | 533 | 1.1% |
| 12.5 | 439 | 0.9% |
| 11.1 | 393 | 0.8% |
| Other values (357) | 7887 | 15.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 34409 | 68.8% |
| 0.4 | 1 | < 0.1% |
| 0.5 | 3 | < 0.1% |
| 0.6 | 1 | < 0.1% |
| 0.7 | 5 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 1416 | 2.8% |
| 85.7 | 2 | < 0.1% |
| 83.3 | 3 | < 0.1% |
| 80 | 11 | < 0.1% |
| 75 | 34 | 0.1% |
trips_in_first_30_days
Real number (ℝ≥0)
| Distinct | 59 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.2782 |
|---|---|
| Minimum | 0 |
| Maximum | 125 |
| Zeros | 15390 |
| Zeros (%) | 30.8% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 9 |
| Maximum | 125 |
| Range | 125 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 3.792684069 |
|---|---|
| Coefficient of variation (CV) | 1.664772219 |
| Kurtosis | 56.57119678 |
| Mean | 2.2782 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 5.167754879 |
| Sum | 113910 |
| Variance | 14.38445245 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 15390 | 30.8% |
| 1 | 14108 | 28.2% |
| 2 | 7402 | 14.8% |
| 3 | 3788 | 7.6% |
| 4 | 2562 | 5.1% |
| 5 | 1616 | 3.2% |
| 6 | 1134 | 2.3% |
| 7 | 819 | 1.6% |
| 8 | 589 | 1.2% |
| 9 | 471 | 0.9% |
| Other values (49) | 2121 | 4.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 15390 | 30.8% |
| 1 | 14108 | 28.2% |
| 2 | 7402 | 14.8% |
| 3 | 3788 | 7.6% |
| 4 | 2562 | 5.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 125 | 1 | < 0.1% |
| 73 | 1 | < 0.1% |
| 71 | 1 | < 0.1% |
| 63 | 1 | < 0.1% |
| 58 | 1 | < 0.1% |
luxury_car_user
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 49.0 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| False | 31146 | 62.3% |
| True | 18854 | 37.7% |
weekday_pct
Real number (ℝ≥0)
| Distinct | 666 |
|---|---|
| Distinct (%) | 1.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 60.926084 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 9203 |
| Zeros (%) | 18.4% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 33.3 |
| median | 66.7 |
| Q3 | 100 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 66.7 |
Descriptive statistics
| Standard deviation | 37.08150341 |
|---|---|
| Coefficient of variation (CV) | 0.6086309996 |
| Kurtosis | -1.154187819 |
| Mean | 60.926084 |
| Median Absolute Deviation (MAD) | 33.3 |
| Skewness | -0.4777875001 |
| Sum | 3046304.2 |
| Variance | 1375.037895 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 16659 | 33.3% |
| 0 | 9203 | 18.4% |
| 50 | 4057 | 8.1% |
| 66.7 | 2088 | 4.2% |
| 33.3 | 1619 | 3.2% |
| 75 | 1104 | 2.2% |
| 60 | 772 | 1.5% |
| 25 | 723 | 1.4% |
| 80 | 668 | 1.3% |
| 40 | 593 | 1.2% |
| Other values (656) | 12514 | 25.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 9203 | 18.4% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 5.9 | 1 | < 0.1% |
| 6.3 | 3 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 16659 | 33.3% |
| 99 | 1 | < 0.1% |
| 98.9 | 2 | < 0.1% |
| 98.5 | 1 | < 0.1% |
| 98.4 | 2 | < 0.1% |
active
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 390.8 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 50000 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 0 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 31690 | 63.4% |
| 1 | 18310 | 36.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 31690 | 63.4% |
| 1 | 18310 | 36.6% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 50000 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 31690 | 63.4% |
| 1 | 18310 | 36.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 50000 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 31690 | 63.4% |
| 1 | 18310 | 36.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 50000 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 31690 | 63.4% |
| 1 | 18310 | 36.6% |
city_0
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 390.8 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 150000 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 33466 | 66.9% |
| 1.0 | 16534 | 33.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 83466 | 55.6% |
| . | 50000 | 33.3% |
| 1 | 16534 | 11.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100000 | 66.7% |
| Other Punctuation | 50000 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 83466 | 83.5% |
| 1 | 16534 | 16.5% |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| . | 50000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 150000 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 83466 | 55.6% |
| . | 50000 | 33.3% |
| 1 | 16534 | 11.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 150000 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 83466 | 55.6% |
| . | 50000 | 33.3% |
| 1 | 16534 | 11.0% |
city_1
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 390.8 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 150000 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 1.0 |
| 5th row | 0.0 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 39870 | 79.7% |
| 1.0 | 10130 | 20.3% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 89870 | 59.9% |
| . | 50000 | 33.3% |
| 1 | 10130 | 6.8% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100000 | 66.7% |
| Other Punctuation | 50000 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 89870 | 89.9% |
| 1 | 10130 | 10.1% |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| . | 50000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 150000 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 89870 | 59.9% |
| . | 50000 | 33.3% |
| 1 | 10130 | 6.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 150000 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 89870 | 59.9% |
| . | 50000 | 33.3% |
| 1 | 10130 | 6.8% |
city_2
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 390.8 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 150000 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 1.0 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 26664 | 53.3% |
| 1.0 | 23336 | 46.7% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 76664 | 51.1% |
| . | 50000 | 33.3% |
| 1 | 23336 | 15.6% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 100000 | 66.7% |
| Other Punctuation | 50000 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 76664 | 76.7% |
| 1 | 23336 | 23.3% |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| . | 50000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 150000 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 76664 | 51.1% |
| . | 50000 | 33.3% |
| 1 | 23336 | 15.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 150000 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 76664 | 51.1% |
| . | 50000 | 33.3% |
| 1 | 23336 | 15.6% |
date_delta
Real number (ℝ≥0)
| Distinct | 182 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 92.7901 |
|---|---|
| Minimum | 0 |
| Maximum | 181 |
| Zeros | 2302 |
| Zeros (%) | 4.6% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 27 |
| median | 110 |
| Q3 | 150 |
| 95-th percentile | 170 |
| Maximum | 181 |
| Range | 181 |
| Interquartile range (IQR) | 123 |
Descriptive statistics
| Standard deviation | 62.12982154 |
|---|---|
| Coefficient of variation (CV) | 0.6695738181 |
| Kurtosis | -1.438478171 |
| Mean | 92.7901 |
| Median Absolute Deviation (MAD) | 48 |
| Skewness | -0.3237976574 |
| Sum | 4639505 |
| Variance | 3860.114724 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 4374 | 8.7% |
| 0 | 2302 | 4.6% |
| 2 | 1063 | 2.1% |
| 155 | 756 | 1.5% |
| 154 | 687 | 1.4% |
| 3 | 595 | 1.2% |
| 153 | 595 | 1.2% |
| 162 | 584 | 1.2% |
| 156 | 579 | 1.2% |
| 148 | 575 | 1.1% |
| Other values (172) | 37890 | 75.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 2302 | 4.6% |
| 1 | 4374 | 8.7% |
| 2 | 1063 | 2.1% |
| 3 | 595 | 1.2% |
| 4 | 433 | 0.9% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 181 | 13 | < 0.1% |
| 180 | 72 | 0.1% |
| 179 | 143 | 0.3% |
| 178 | 152 | 0.3% |
| 177 | 202 | 0.4% |
month signup_date
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 390.8 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 50000 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 50000 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 50000 | 100.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 50000 | 100.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 50000 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 50000 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 50000 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 50000 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 50000 | 100.0% |
month last_trip_date
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.04232 |
|---|---|
| Minimum | 1 |
| Maximum | 7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 390.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 5 |
| Q3 | 6 |
| 95-th percentile | 6 |
| Maximum | 7 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 1.992879437 |
|---|---|
| Coefficient of variation (CV) | 0.4930038781 |
| Kurtosis | -1.394615843 |
| Mean | 4.04232 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.4290588893 |
| Sum | 202116 |
| Variance | 3.971568449 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 6 | 18256 | 36.5% |
| 1 | 10147 | 20.3% |
| 5 | 7585 | 15.2% |
| 4 | 4588 | 9.2% |
| 3 | 4568 | 9.1% |
| 2 | 4308 | 8.6% |
| 7 | 548 | 1.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 10147 | 20.3% |
| 2 | 4308 | 8.6% |
| 3 | 4568 | 9.1% |
| 4 | 4588 | 9.2% |
| 5 | 7585 | 15.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 548 | 1.1% |
| 6 | 18256 | 36.5% |
| 5 | 7585 | 15.2% |
| 4 | 4588 | 9.2% |
| 3 | 4568 | 9.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
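The definition above translates directly into code. The following is a minimal sketch on toy data (not the report's columns); the function name `pearson_r` is chosen for this example. It also illustrates the invariance under shifts and positive rescaling.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (spread_x * spread_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shifting and positively rescaling either variable leaves r unchanged.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)
```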
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
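As a sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. The helper names below are chosen for this example, and ties are given average ranks, which is a common convention assumed here rather than something stated in the report. The toy data is a cubic relationship: monotonic but not linear.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic, nonlinear

print("Pearson r: ", pearson_r(x, y))
print("Spearman ρ:", spearman_rho(x, y))
```

On this data ρ reaches its maximum while r stays below it, which is the behaviour the paragraph describes.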
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
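The pair-counting rule can be sketched in a few lines. This implements the formula exactly as worded above (often called τ-a); library implementations commonly use the tie-adjusted τ-b variant instead, so results can differ when the data contains ties. The function name and toy data are chosen for this example.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 is a tie: neither concordant nor discordant
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Six pairs in total; only the pair (2, 3) vs (3, 2) is discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```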
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | avg_dist | avg_rating_by_driver | avg_rating_of_driver | avg_surge | phone | surge_pct | trips_in_first_30_days | luxury_car_user | weekday_pct | active | city_0 | city_1 | city_2 | date_delta | month signup_date | month last_trip_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3.67 | 5.0 | 4.700000 | 1.10 | 1.0 | 15.4 | 4 | True | 46.2 | 1 | 0.0 | 1.0 | 0.0 | 143 | 1 | 6 |
| 1 | 8.26 | 5.0 | 5.000000 | 1.00 | 0.0 | 0.0 | 0 | False | 50.0 | 0 | 1.0 | 0.0 | 0.0 | 96 | 1 | 5 |
| 2 | 0.77 | 5.0 | 4.300000 | 1.00 | 1.0 | 0.0 | 3 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 |
| 3 | 2.36 | 4.9 | 4.600000 | 1.14 | 1.0 | 20.0 | 9 | True | 80.0 | 1 | 0.0 | 1.0 | 0.0 | 170 | 1 | 6 |
| 4 | 3.13 | 4.9 | 4.400000 | 1.19 | 0.0 | 11.8 | 14 | False | 82.4 | 0 | 0.0 | 0.0 | 1.0 | 47 | 1 | 3 |
| 5 | 10.56 | 5.0 | 3.500000 | 1.00 | 1.0 | 0.0 | 2 | True | 100.0 | 1 | 0.0 | 0.0 | 1.0 | 148 | 1 | 6 |
| 6 | 3.95 | 4.0 | 4.601559 | 1.00 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 |
| 7 | 2.04 | 5.0 | 5.000000 | 1.00 | 1.0 | 0.0 | 2 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 |
| 8 | 4.36 | 5.0 | 4.500000 | 1.00 | 0.0 | 0.0 | 2 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 11 | 1 | 2 |
| 9 | 2.37 | 5.0 | 4.601559 | 1.00 | 0.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 0.0 | 1.0 | 2 | 1 | 1 |
Last rows
| | avg_dist | avg_rating_by_driver | avg_rating_of_driver | avg_surge | phone | surge_pct | trips_in_first_30_days | luxury_car_user | weekday_pct | active | city_0 | city_1 | city_2 | date_delta | month signup_date | month last_trip_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 49990 | 3.38 | 5.0 | 4.700000 | 1.08 | 1.0 | 33.3 | 1 | True | 33.3 | 0 | 1.0 | 0.0 | 0.0 | 125 | 1 | 5 |
| 49991 | 1.06 | 5.0 | 5.000000 | 1.25 | 1.0 | 100.0 | 0 | False | 0.0 | 1 | 0.0 | 0.0 | 1.0 | 172 | 1 | 6 |
| 49992 | 7.58 | 5.0 | 1.000000 | 1.00 | 1.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 1.0 | 0.0 | 1 | 1 | 1 |
| 49993 | 2.53 | 4.7 | 4.800000 | 1.11 | 1.0 | 11.1 | 3 | True | 55.6 | 1 | 1.0 | 0.0 | 0.0 | 179 | 1 | 7 |
| 49994 | 2.25 | 4.5 | 4.600000 | 1.44 | 1.0 | 37.5 | 1 | False | 25.0 | 0 | 1.0 | 0.0 | 0.0 | 148 | 1 | 5 |
| 49995 | 5.63 | 4.2 | 5.000000 | 1.00 | 1.0 | 0.0 | 0 | False | 100.0 | 1 | 0.0 | 1.0 | 0.0 | 131 | 1 | 6 |
| 49996 | 0.00 | 4.0 | 4.601559 | 1.00 | 1.0 | 0.0 | 1 | False | 0.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 |
| 49997 | 3.86 | 5.0 | 5.000000 | 1.00 | 0.0 | 0.0 | 0 | True | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 111 | 1 | 5 |
| 49998 | 4.58 | 3.5 | 3.000000 | 1.00 | 1.0 | 0.0 | 2 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 |
| 49999 | 3.49 | 5.0 | 4.601559 | 1.00 | 0.0 | 0.0 | 0 | False | 0.0 | 0 | 1.0 | 0.0 | 0.0 | 92 | 1 | 4 |
Most frequent
| | avg_dist | avg_rating_by_driver | avg_rating_of_driver | avg_surge | phone | surge_pct | trips_in_first_30_days | luxury_car_user | weekday_pct | active | city_0 | city_1 | city_2 | date_delta | month signup_date | month last_trip_date | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 0.00 | 5.0 | 5.000000 | 1.0 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 4 |
| 4 | 0.00 | 5.0 | 4.601559 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 3 |
| 0 | 0.00 | 1.0 | 4.601559 | 1.0 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 0 | 1 | 1 | 2 |
| 1 | 0.00 | 5.0 | 1.000000 | 1.0 | 0.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 0.0 | 1.0 | 0 | 1 | 1 | 2 |
| 2 | 0.00 | 5.0 | 4.601559 | 1.0 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 2 |
| 3 | 0.00 | 5.0 | 4.601559 | 1.0 | 1.0 | 0.0 | 1 | False | 0.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 | 2 |
| 5 | 0.00 | 5.0 | 4.601559 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 | 2 |
| 7 | 0.00 | 5.0 | 5.000000 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 2 |
| 8 | 0.00 | 5.0 | 5.000000 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 0 | 1 | 1 | 2 |
| 9 | 0.01 | 5.0 | 4.601559 | 1.0 | 0.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 2 |
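A table like "Most frequent" can be produced by counting identical rows. The sketch below uses only the standard library and toy rows; the function name `most_frequent_duplicates` is hypothetical, and the report's exact counting rule for its duplicate total is not assumed here.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple

Row = Tuple[Hashable, ...]


def most_frequent_duplicates(rows: Sequence[Row], top: int = 10) -> List[Tuple[Row, int]]:
    """Rows that occur more than once, most frequent first."""
    counts = Counter(rows)
    duplicated = [(row, count) for row, count in counts.items() if count > 1]
    duplicated.sort(key=lambda item: item[1], reverse=True)
    return duplicated[:top]


# Toy rows, not taken from the dataset.
rows = [("a", 1), ("b", 2), ("a", 1), ("c", 3), ("b", 2), ("a", 1)]
for row, count in most_frequent_duplicates(rows):
    print(row, count)
```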